Neural Information Processing Systems

We then provide an algorithm to efficiently sketch input data using such embeddings. The sketch size of the embedding used in the algorithm has a linear dependence on the number of sketching dimensions of the input.


Communication-efficient Distributed SGD with Sketching

Neural Information Processing Systems

However, theoretical and empirical evidence both suggest that there is a maximum mini-batch size beyond which the number of iterations required to converge stops decreasing, and generalization error begins to increase [Ma et al., 2017, Li et al., 2014, Golmant et al., 2018, Shallue et al., 2018, Keskar et al., 2016, Hoffer et al., 2017]. In this paper, we aim instead to decrease the communication cost per worker.